Search Result

Analysis of Bi-directional Reranking Model for Uyghur-Chinese Neural Machine Translation

ZHANG Xinlu, LI Xiao, YANG Yating, WANG Lei, DONG Rui

Acta Scientiarum Naturalium Universitatis Pekinensis 2020, 56 (1): 31-38. DOI: 10.13209/j.0479-8023.2019.093

When trained on a low-resource corpus such as Uyghur-Chinese, neural machine translation models easily fall into local optima, so the output of a single model may not be the globally optimal translation. To address this problem, an ensemble strategy integrates the probability distributions predicted by multiple models, so that the translation models act as a whole. In addition, translation models with opposite decoding directions are combined through a reranking method based on cross entropy, and the candidate translation with the highest combined score is selected as the output. Experiments on the CWMT2015 Uyghur-Chinese parallel corpus show that the proposed method improves the BLEU score by 4.82 points over a single Transformer model.
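The two ideas in this abstract, ensembling predicted distributions and reranking candidates with models that decode in opposite directions, can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform averaging, the equal interpolation weight, the function names, and the toy probabilities are all assumptions made for illustration.

```python
import math

def ensemble_distribution(distributions):
    """Average next-token distributions from several models (uniform weights assumed)."""
    n = len(distributions)
    vocab_size = len(distributions[0])
    return [sum(d[i] for d in distributions) / n for i in range(vocab_size)]

def cross_entropy(token_probs):
    """Mean negative log-probability that a model assigns to a candidate's tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def rerank(candidates, weight=0.5):
    """Return the candidate text with the highest combined score.

    Each candidate is (text, l2r_token_probs, r2l_token_probs), where the two
    lists hold the per-token probabilities from a left-to-right and a
    right-to-left model. The score is the negated, interpolated cross entropy;
    `weight` is a hypothetical tuning parameter.
    """
    def score(candidate):
        _, l2r, r2l = candidate
        return -(weight * cross_entropy(l2r) + (1 - weight) * cross_entropy(r2l))
    return max(candidates, key=score)[0]

# Toy probabilities, invented for illustration only.
candidates = [
    ("candidate A", [0.9, 0.8], [0.2, 0.3]),  # favoured by one direction only
    ("candidate B", [0.6, 0.7], [0.7, 0.6]),  # acceptable to both directions
]
best = rerank(candidates)
```

In this toy case the candidate that both directions find plausible wins over the one that only the left-to-right model prefers, which is the intuition behind combining opposite decoding directions.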

Collaborative Analysis of Uyghur Morphology Based on Character Level

Turghun Osman, YANG Yating, Eziz Tursun, CHENG Li

Acta Scientiarum Naturalium Universitatis Pekinensis 2019, 55 (1): 47-54. DOI: 10.13209/j.0479-8023.2018.067

The Uyghur language has various inflectional affixes, complex structures and phonetic changes. The authors propose a collaborative analysis method for Uyghur morphology at the character level. It comprises three procedures: morpheme segmentation, morphological annotation and reduction of phonetic changes. The main characteristic of this method is the use of a composite tag to represent morpheme boundaries, annotations and phonetic changes. In addition, character sequence annotation is used to train the model. Experimental results show that the accuracy of morpheme segmentation, morphological annotation and reduction of phonetic changes reaches 96.39%, 92.78% and 99.79%, respectively. The overall accuracy of the system reaches 92.59%.
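One way to picture a composite tag is as a single label per character that carries all three pieces of information at once. The sketch below decodes such tags into morphemes. The tag layout `BOUNDARY|LABEL|RESTORE`, the label names, and the Latin-script example word are assumptions for illustration; the abstract does not specify the tag set the authors use.

```python
def decode(chars, tags):
    """Turn per-character composite tags into (surface, restored, label) morphemes.

    Hypothetical tag layout: 'BOUNDARY|LABEL|RESTORE'
      BOUNDARY: 'B' starts a new morpheme, 'I' continues the current one
      LABEL:    morphological annotation of the morpheme
      RESTORE:  '=' keeps the character, otherwise the character it is restored to
    """
    morphemes = []
    for ch, tag in zip(chars, tags):
        boundary, label, restore = tag.split("|")
        original = ch if restore == "=" else restore
        if boundary == "B" or not morphemes:
            morphemes.append([ch, original, label])
        else:
            morphemes[-1][0] += ch
            morphemes[-1][1] += original
    return [tuple(m) for m in morphemes]

# Illustrative word in Latin script: the stem vowel weakens before the plural suffix.
chars = list("balilar")
tags = ["B|STEM|=", "I|STEM|=", "I|STEM|=", "I|STEM|a",
        "B|PL|=", "I|PL|=", "I|PL|="]
analysis = decode(chars, tags)
```

Because each character receives exactly one tag, a standard sequence labeling model can predict segmentation, annotation and restoration jointly rather than in three separate passes.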
